Goal-conditioned RL agents
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Robots (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Instructing Goal-Conditioned Reinforcement Learning Agents with Temporal Logic Objectives
Goal-conditioned reinforcement learning (RL) is a powerful approach for learning general-purpose skills by reaching diverse goals. However, it has limitations when it comes to task-conditioned policies, where goals are specified by temporally extended instructions written in the Linear Temporal Logic (LTL) formal language. Existing approaches for finding LTL-satisfying policies rely on sampling a large set of LTL instructions during training to adapt to unseen tasks at inference time. However, these approaches do not guarantee generalization to out-of-distribution LTL objectives, which may have increased complexity. In this paper, we propose a novel approach to address this challenge. We show that simple goal-conditioned RL agents can be instructed to follow arbitrary LTL specifications without additional training over the LTL task space. Unlike existing approaches that focus on LTL specifications expressible as regular expressions, our technique is unrestricted and generalizes to ω-regular expressions. Experimental results demonstrate the effectiveness of our approach in adapting goal-conditioned RL agents to satisfy complex temporal logic task specifications zero-shot.
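The abstract does not reproduce the paper's algorithm; as a rough illustration of the general idea, the sketch below steers a frozen goal-conditioned policy with a hand-written automaton for the formula F(a) ∧ F(b) ("eventually reach a, and eventually reach b"). The automaton table, goal coordinates, `goal_conditioned_policy` stub, and `run_episode` loop are all hypothetical stand-ins; a real pipeline would translate the LTL formula into an automaton with a dedicated tool rather than writing it by hand.

```python
# Illustrative sketch (not the paper's implementation): steering a frozen
# goal-conditioned policy with an automaton for F(a) & F(b),
# i.e. "eventually reach region a, and eventually reach region b".
import numpy as np

# Automaton states: 0 = nothing reached, 1 = a reached, 2 = accepting.
# Each non-accepting state names the goal region to pursue next; reaching
# that region fires the transition.
AUTOMATON = {
    0: {"target": "a", "next": 1},
    1: {"target": "b", "next": 2},
}
GOALS = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 1.0])}  # hypothetical

def goal_conditioned_policy(state, goal):
    """Stand-in for a pretrained agent: a greedy step toward the goal."""
    step = goal - state
    return 0.1 * step / (np.linalg.norm(step) + 1e-8)

def run_episode(max_steps=500, eps=0.05):
    state, q = np.zeros(2), 0
    for _ in range(max_steps):
        if q not in AUTOMATON:                    # accepting state reached
            return True
        goal = GOALS[AUTOMATON[q]["target"]]
        state = state + goal_conditioned_policy(state, goal)
        if np.linalg.norm(state - goal) < eps:    # proposition now holds
            q = AUTOMATON[q]["next"]              # advance the automaton
    return q not in AUTOMATON

print("LTL objective satisfied:", run_episode())
```

The key design point the sketch tries to convey: the goal-conditioned policy is never retrained; only the automaton state decides which goal the policy is conditioned on at each moment.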
Learning Domain Invariant Representations in Goal-conditioned Block MDPs
Deep Reinforcement Learning (RL) is successful in solving many complex Markov Decision Process (MDP) problems. However, agents often face unanticipated environmental changes after deployment in the real world. These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents. Unfortunately, deep RL policies are usually sensitive to these changes and fail to act robustly against them. This resembles the problem of domain generalization in supervised learning. In this work, we study this problem for goal-conditioned RL agents. We propose a theoretical framework in the Block MDP setting that characterizes the generalizability of goal-conditioned policies to new environments. Under this framework, we develop a practical method, PA-SkewFit, that enhances domain generalization. The empirical evaluation shows that our goal-conditioned RL agent can perform well in various unseen test environments, improving by 50% over baselines.
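PA-SkewFit itself is not spelled out in the abstract; the hedged sketch below shows one common recipe with the same flavor of domain invariance in Block MDPs: encode observations and penalize the distance between encodings of the same underlying state rendered under different domains (backgrounds), so the representation ignores the spurious factor. The `render` function, network sizes, and training loop are toy assumptions, not the paper's method.

```python
# Illustrative sketch (not PA-SkewFit): align representations of the same
# latent state across two domains that differ only in a nuisance factor.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 8))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

def render(latent, domain_shift):
    """Toy Block-MDP observation: latent state plus domain-specific nuisance."""
    nuisance = domain_shift * torch.ones(latent.shape[0], 8)
    return torch.cat([latent, nuisance], dim=-1)

for step in range(200):
    latent = torch.randn(32, 8)               # shared underlying state
    obs_a = render(latent, domain_shift=0.0)  # e.g. training background
    obs_b = render(latent, domain_shift=3.0)  # e.g. shifted background
    z_a, z_b = encoder(obs_a), encoder(obs_b)
    # Alignment term: same latent state -> same representation across domains.
    align_loss = ((z_a - z_b) ** 2).mean()
    # A real agent would add its RL objective here; only alignment is shown.
    opt.zero_grad()
    align_loss.backward()
    opt.step()
```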
RL for Planning and Planning for RL
[Figure: (a) Goal-conditioned RL often fails to reach distant goals, but can successfully reach the goal if starting nearby (inside the green region).]

Reinforcement learning (RL) has seen a lot of progress over the past few years, tackling increasingly complex tasks. Much of this progress has been enabled by combining existing RL algorithms with powerful function approximators (i.e., neural networks). Function approximators have enabled researchers to apply RL to tasks with high-dimensional inputs without hand-crafting representations, distance metrics, or low-level controllers. However, function approximators have not come for free, and their cost is reflected in notoriously challenging optimization: deep RL algorithms are famously unstable and sensitive to hyperparameters.
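As a hedged sketch of the waypoint idea in the figure caption (not the post's exact method): if the policy is only reliable over short distances, one can plan a chain of short hops with a shortest-path search over candidate states and hand the waypoints to the goal-conditioned policy one at a time. Below, `estimated_distance`, the candidate states, and the edge threshold are illustrative assumptions; a learned system would typically derive the distance estimate from a value function.

```python
# Illustrative sketch: Dijkstra over candidate states, keeping only edges the
# goal-conditioned policy could plausibly traverse (short estimated distance).
import heapq
import numpy as np

def estimated_distance(s, g):
    """Stand-in for a learned distance (e.g. derived from a value function)."""
    return float(np.linalg.norm(np.asarray(s) - np.asarray(g)))

def plan_waypoints(start, goal, candidates, max_edge=1.5):
    """Shortest path from start to goal using only short hops."""
    nodes = [tuple(start)] + [tuple(c) for c in candidates] + [tuple(goal)]
    dist = {n: float("inf") for n in nodes}
    prev = {}
    dist[tuple(start)] = 0.0
    heap = [(0.0, tuple(start))]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v in nodes:
            w = estimated_distance(u, v)
            if v != u and w <= max_edge and d + w < dist[v]:
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    # Reconstruct start -> goal path; only short edges are ever used.
    path, node = [], tuple(goal)
    while node != tuple(start):
        path.append(node)
        node = prev[node]            # KeyError here means goal is unreachable
    return list(reversed(path))

waypoints = plan_waypoints(
    start=(0.0, 0.0), goal=(4.0, 0.0),
    candidates=[(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)],
)
print(waypoints)   # a chain of nearby subgoals ending at the distant goal
```

Each returned waypoint lies within the policy's reliable range of the previous one, so the distant goal is reduced to a sequence of short-horizon goal-reaching problems.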